Results 1 - 20 of 42
1.
Synth Syst Biotechnol ; 7(4): 1148-1158, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36101898

ABSTRACT

A parallel screening of 27 different flavonoids and chalcones was conducted using 6 artificial naringenin-activated riboswitches (M1, M2, M3, O, L and H). A quantitative structure-property relationship approach was applied to identify the physicochemical properties of the flavonoid structures that give rise to differences in specificity, as measured by the fluorescence intensity of a green fluorescent protein reporter. Robust models of riboswitches M1, M2 and O that had good predictive power were constructed with descriptors selected for their high correlation. Increased electronegativity and hydrophilicity of the flavonoid structures were identified as two properties that increased binding affinity to RNA riboswitches. Hydroxyl groups at the C-3' and C-4' positions of the flavonoid molecule were strictly required for ligand activation with riboswitches M1 and M2. Riboswitches O and L preferred multi-hydroxylated flavones as ligands. Substitutions on the A ring of the flavonoid molecule were not important in the molecular recognition process. O-glycosylated derivatives were not recognized by any of the riboswitches, presumably due to steric hindrance. Despite the challenges of detecting RNA conformational change after ligand binding, the resulting models elucidate important physicochemical features in the ligands for conformational structural studies of artificial aptamer complexes and for the design of ligands having higher binding specificity.

2.
Bioconjug Chem ; 32(9): 1984-1998, 2021 09 15.
Article in English | MEDLINE | ID: mdl-34384218

ABSTRACT

Accurate detection of doses is critical for the development of effective countermeasures and patient stratification strategies in cases of accidental exposure to ionizing radiation. Existing detection devices are limited by high fabrication costs, long processing times, need for sophisticated detection systems, and/or loss of readout signal over time, particularly in complex environments. Here, we describe fundamental studies on amino acid-facilitated templating of gold nanoparticles following exposure to ionizing radiation as a new colorimetric approach for radiation detection. Tryptophan demonstrated spontaneous nanoparticle formation, and parallel screening of a library of amino acids and related compounds led to the identification of lead candidates, including phenylalanine, which demonstrated an increase in absorbance at wavelengths typical of gold nanoparticles in the presence of ionizing radiation (X-rays). Evaluation of screening, i.e., absorbance data, in concert with chemical informatics modeling led to the elucidation of physicochemical properties, particularly polarizable regions and partial charges, that governed nanoparticle formation propensities upon exposure of amino acids to ionizing radiation. NMR spectroscopy revealed key roles of amino and carboxy moieties in determining the nanoparticle formation propensity of phenylalanine, a lead amino acid from the screen. These findings were employed for fabricating radiation-responsive amino acid nanosensor gels (RANGs) based on phenylalanine and tryptophan, and efficacy of RANGs was demonstrated for predicting clinical doses of ionizing radiation in anthropomorphic thorax phantoms and in live canine patients undergoing radiotherapy. 
The use of biocompatible templating ligands (amino acids), rapid response, simplicity of fabrication, efficacy, ease of operation and detection, and long-lasting readout indicate several advantages of the RANG over existing detection systems for monitoring radiation in clinical radiotherapy, radiological emergencies, and trauma care.


Subjects
Metal Nanoparticles; Animals; Colorimetry; Dogs; Gold
3.
ACS Biomater Sci Eng ; 5(2): 654-669, 2019 Feb 11.
Article in English | MEDLINE | ID: mdl-33405829

ABSTRACT

Quantitative approaches to structure-property relationships are critical for the accelerated design and discovery of biomaterials in biotechnology and medicine. However, the absence of definitive structures, unlike those available for small molecules or 3D crystal structures available for some proteins, has limited the development of Quantitative Structure-Property Relationship (QSPR) models for investigating physicochemical properties and biological activity of polymers. In this study, we describe a combined experimental and cheminformatics paradigm for first developing QSPR models of polymer physicochemical properties, including molecular weight, hydrophobicity, and DNA-binding activity. Quantitative Structure-Activity Relationship (QSAR) models of polymer-mediated transgene expression were then developed using these physicochemical properties with an eye towards developing a novel two-step chemical informatics paradigm for determining biological activity (e.g., transgene expression) of polymer properties as related to physicochemical properties. We also investigated a more conventional approach in which biomaterial efficacy, i.e., transgene expression activity, was directly correlated to structural representations of the polymers used for delivering plasmid DNA. Our generalized chemical informatics approach can accelerate the discovery of polymeric biomaterials for several applications in biotechnology and medicine, including in nucleic acid delivery.

4.
J Am Soc Mass Spectrom ; 28(6): 1013-1020, 2017 06.
Article in English | MEDLINE | ID: mdl-28361384

ABSTRACT

The effects of oxygen addition on a helium-based flowing atmospheric pressure afterglow (FAPA) ionization source are explored. Small amounts of oxygen doped into the helium discharge gas resulted in an increase in abundance of protonated water clusters by at least three times. A corresponding increase in protonated analyte signal was also observed for small polar analytes, such as methanol and acetone. Meanwhile, most other reagent ions (e.g., O2+·, NO+, etc.) significantly decrease in abundance with even 0.1% v/v oxygen in the discharge gas. Interestingly, when analytes that contained aromatic constituents were subjected to a He:O2-FAPA, a unique (M + 3)+ ion resulted, while molecular or protonated molecular ions were rarely detected. Exact-mass measurements revealed that these (M + 3)+ ions correspond to (M - CH + O)+, with the most likely structure being pyrylium. Presence of pyrylium-based ions was further confirmed by tandem mass spectrometry of the (M + 3)+ ion compared with that of a commercially available salt. Lastly, rapid and efficient production of pyrylium in the gas phase was used to convert benzene into pyridine. Although this pyrylium-formation reaction has not been shown before, it is rapid and efficient. Potential reactant species, which could lead to pyrylium formation, were determined from reagent-ion mass spectra. Thermodynamic evaluation of reaction pathways was aided by calculation of the formation enthalpy for pyrylium, which was found to be 689.8 kJ/mol. Based on these results, we propose that this reaction is initiated by ionized ozone (O3+·), proceeds similarly to ozonolysis, and results in the neutral loss of the stable CHO2· radical.
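The (M + 3)+ assignment can be checked with nominal-mass arithmetic: losing CH removes 13 Da and gaining O adds 16 Da, for a net shift of +3. The following is a minimal Python sketch of that check for benzene; the helper name `nominal_mass` and the use of integer nominal masses (rather than the exact masses measured in the study) are assumptions made for illustration.

```python
# Nominal (integer) atomic masses; exact-mass work would use monoisotopic values.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}


def nominal_mass(formula):
    """Sum nominal masses for a formula given as {element: atom count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


benzene = {"C": 6, "H": 6}            # M
pyrylium = {"C": 5, "H": 5, "O": 1}   # M - CH + O, for M = benzene

shift = nominal_mass(pyrylium) - nominal_mass(benzene)
print(nominal_mass(benzene), nominal_mass(pyrylium), shift)  # 78 81 3
```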

5.
J Colloid Interface Sci ; 495: 130-139, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28193511

ABSTRACT

Adding nano-sized fillers to epoxy has proven to be an effective method for improving dielectric breakdown strength (DBS). Evidence suggests that the dispersion state, as well as the chemistry at the filler-matrix interface, can play a crucial role in property enhancement. Herein we investigate the contribution of both filler dispersion and surface chemistry to the AC dielectric breakdown strength of silica-epoxy nanocomposites. Ligand engineering was used to synthesize bimodal ligands on 15 nm silica nanoparticles, consisting of long, epoxy-compatible poly(glycidyl methacrylate) (PGMA) chains and short, π-conjugated, electroactive surface ligands. Surface-initiated RAFT polymerization was used to synthesize multiple graft densities of PGMA chains, ultimately controlling the dispersion of the filler. Thiophene, anthracene, and terthiophene were employed as π-conjugated surface ligands that act as electron traps to mitigate avalanche breakdown. Investigation of the synthesized multifunctional nanoparticles was effective in defining the maximum particle spacing or free space length (Lf) that still leads to property enhancement, as well as giving insight into the effects of varying the electronic nature of the molecules at the interface on breakdown strength. Optimization of the investigated variables was shown to increase the AC dielectric breakdown strength of epoxy composites by as much as 34% with only 2 wt% silica loading.

6.
Article in English | MEDLINE | ID: mdl-28031013

ABSTRACT

OBJECTIVE: Support Vector Regression (SVR) has become increasingly popular in cheminformatics modeling. As a result, SVR-based machine learning algorithms, including Fuzzy-SVR and Least Square-SVR (LS-SVR) have been developed and applied in various research areas. However, at present, few downloadable packages or public-domain software are available for these algorithms. To address this need, we developed the Support vector regression-based Online Learning Equipment (SOLE) web tool (available at http://reccr.chem.rpi.edu/SOLE/index.html) as an online learning system to support predictive cheminformatics and materials informatics studies. RESULTS: In this work, we employed the SOLE system to model transgene expression efficacy of polymers obtained from aminoglycoside antibiotics, which allowed the results of several modeling approaches to be easily compared. All models had test set r2 of 0.96-0.98 and test set R2 of 0.79-0.84. Y-scrambling test showed the models were stable and not over-fitted. CONCLUSION: SOLE has a user-friendly interface and includes routine elements of performing QSAR/QSPR studies that can be applied in various research areas. It utilizes rational and sophisticated feature selection, model selection and model evaluation processes.
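The Y-scrambling test mentioned in this abstract can be illustrated with a small sketch: the response values are shuffled repeatedly, the model is refit each time, and the scrambled scores are compared with the score on the true responses. The Python below is a hedged illustration only. It is not the SOLE implementation, it swaps SVR for a one-descriptor linear fit (for which the training r² equals the squared Pearson correlation), and the function names and toy data are assumptions.

```python
import random


def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_scramble(x, y, n_rounds=20, seed=0):
    """Return r2 values obtained after shuffling the responses n_rounds times."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)
        scores.append(pearson_r2(x, shuffled))
    return scores


# Hypothetical toy data: one descriptor with a perfectly linear response.
descriptor = [float(i) for i in range(10)]
response = [2.0 * v + 1.0 for v in descriptor]

true_r2 = pearson_r2(descriptor, response)
scrambled = y_scramble(descriptor, response)
mean_scrambled_r2 = sum(scrambled) / len(scrambled)
print(true_r2, mean_scrambled_r2)
```

A model that is not over-fitted should score far better on the true responses than on the scrambled ones, which is the pattern the abstract reports.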


Subjects
Aminoglycosides/chemistry; Machine Learning; Polyamines/chemistry; Software; Transfection; Transgenes; Algorithms; Anti-Bacterial Agents/chemistry; Cell Line, Tumor; Humans; Least-Squares Analysis; Models, Biological; Polyelectrolytes; Quantitative Structure-Activity Relationship; Regression Analysis; Transfection/methods
7.
J Phys Condens Matter ; 28(32): 325502, 2016 08 17.
Article in English | MEDLINE | ID: mdl-27324304

ABSTRACT

We report simulations based on density functional theory and many-body perturbation theory exploring the band gaps of common crystalline polymers including polyethylene, polypropylene and polystyrene. Our reported band gaps of 8.6 eV for single-chain polyethylene and 9.1 eV for bulk crystalline polyethylene are in excellent agreement with experiment. The effects of chemical doping along the polymer backbone and side-groups are explored, and the use of mechanical strain as a means to modify the band gaps of these polymers over a range of several eV while leaving the dielectric constant unchanged is discussed. This work highlights some of the opportunities available to engineer the electronic properties of polymers with wide-reaching implications for polymeric dielectric materials used for capacitive energy storage.

8.
ACS Biomater Sci Eng ; 1(8): 656-668, 2015 Aug 10.
Article in English | MEDLINE | ID: mdl-33435089

ABSTRACT

We describe the parallel synthesis of lipopolymers generated by conjugating alkanoyl chlorides to polymers derived from aminoglycoside antibiotic monomers as novel vehicles for transgene delivery and expression in mammalian cells. Parallel screening of lipopolymers led to the identification of six leads that demonstrated higher transgene expression efficacies in several cancer cells, when compared to the parental polymers as well as 25 kDa poly(ethylene imine), a current standard for polymer-mediated transgene expression. Quantitative structure-activity relationship (QSAR)-based cheminformatics modeling was employed in order to investigate the role of lipopolymer physicochemical properties (molecular descriptors) on transgene expression efficacy. The predictive ability of the QSAR model, investigated using lipopolymers not employed for training the model, demonstrated excellent agreement with experimentally observed transgene expression. Our findings indicate that lipid substitution on aminoglycoside-derived polymers results in high levels of transgene expression compared to unsubstituted polymers. Taken together, these materials show significant promise in nonviral transgene delivery with several applications in biotechnology and medicine.

9.
PLoS One ; 9(3): e93108, 2014.
Article in English | MEDLINE | ID: mdl-24667334

ABSTRACT

AIDS is a global pandemic that has seen the development of novel and effective treatments to improve the quality of life of those infected and to reduce the spread of the disease. Palmitic Acid (PA), which we identified and isolated from Sargassum fusiforme, is a naturally occurring fatty acid that specifically inhibits HIV entry by binding to a novel pocket on the CD4 receptor. We also identified a structural analogue, 2-bromopalmitate (2-BP), as a more effective HIV entry inhibitor with a 20-fold increase in efficacy. We have used the structure-activity relationship (SAR) of 2-BP as a platform to identify new small chemical molecules that fit into the various identified active sites in an effort to identify more potent CD4 entry inhibitors. To validate further drug development, we tested the PA and 2-BP scaffold molecules for genotoxic potential. The FDA and International Conference on Harmonisation (ICH) recommend using a standardized 3-test battery for testing compound genotoxicity consisting of the bacterial reverse mutation assay, mouse lymphoma assay, and rat micronucleus assay. PA and 2-BP and their metabolites tested negative in all three genotoxicity tests. 2-BP is the first derivative of PA to undergo pre-clinical screening, which will enable us to now test multiple small chemical structures simultaneously, based on activity in scaffold modeling, across the dimension of pre-clinical testing to enable transition to human testing.


Subjects
Biological Products/chemistry; Biological Products/toxicity; HIV Fusion Inhibitors/chemistry; HIV Fusion Inhibitors/toxicity; HIV/drug effects; HIV/physiology; Virus Internalization/drug effects; Animals; Biological Products/pharmacology; Drug Discovery; Female; HIV Fusion Inhibitors/pharmacology; Lymphoma/pathology; Male; Mice; Micronucleus Tests; Palmitates/chemistry; Palmitates/pharmacology; Palmitates/toxicity; Palmitic Acid/chemistry; Palmitic Acid/pharmacology; Palmitic Acid/toxicity; Rats; Salmonella typhimurium/drug effects; Salmonella typhimurium/genetics; Structure-Activity Relationship
10.
Biomaterials ; 35(6): 1977-88, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24331709

ABSTRACT

We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening identified several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure-Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and 'building block' polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology.


Subjects
Anti-Bacterial Agents/chemistry; Informatics; Models, Chemical; Polymers/chemistry; Aminoglycosides/chemistry; Combinatorial Chemistry Techniques; Gene Transfer Techniques; Quantitative Structure-Activity Relationship; Support Vector Machine
11.
J Chem Inf Model ; 53(12): 3352-66, 2013 Dec 23.
Article in English | MEDLINE | ID: mdl-24261543

ABSTRACT

Computational methods that can identify CYP-mediated sites of metabolism (SOMs) of drug-like compounds have become required tools for early stage lead optimization. In recent years, methods that combine CYP binding site features with CYP/ligand binding information have been sought in order to increase the prediction accuracy of such hybrid models over those that use only one representation. Two challenges that any hybrid ligand/structure-based method must overcome are (1) identification of the best binding pose for a specific ligand with a given CYP and (2) appropriately incorporating the results of docking with ligand reactivity. To address these challenges we have created Docking-Regioselectivity-Predictor (DR-Predictor)--a method that incorporates flexible docking-derived information with specialized electronic reactivity and multiple-instance-learning methods to predict CYP-mediated SOMs. In this study, the hybrid ligand-structure-based DR-Predictor method was tested on substrate sets for CYP 1A2 and CYP 2A6. For these data, the DR-Predictor model was found to identify the experimentally observed SOM within the top two predicted rank-positions for 86% of the 261 1A2 substrates and 83% of the 100 2A6 substrates. Given the accuracy and extendibility of the DR-Predictor method, we anticipate that it will further facilitate the prediction of CYP metabolism liabilities and aid in in-silico ADMET assessment of novel structures.


Subjects
Artificial Intelligence; Aryl Hydrocarbon Hydroxylases/chemistry; Cytochrome P-450 CYP1A2/chemistry; Molecular Docking Simulation; Small Molecule Libraries/chemistry; Aryl Hydrocarbon Hydroxylases/metabolism; Biotransformation; Catalytic Domain; Cytochrome P-450 CYP1A2/metabolism; Cytochrome P-450 CYP2A6; Humans; Hydrogen Bonding; Hydrophobic and Hydrophilic Interactions; Ligands; Protein Binding; Small Molecule Libraries/metabolism; Structure-Activity Relationship; Substrate Specificity; Thermodynamics
12.
Adv Funct Mater ; 23(46): 5746-5752, 2013 Dec 10.
Article in English | MEDLINE | ID: mdl-27524957

ABSTRACT

Accelerated insertion of nanocomposites into advanced applications is predicated on the ability to perform a priori property predictions on the resulting materials. In this paper, a paradigm for the virtual design of spherical nanoparticle-filled polymers is demonstrated. A key component of this "Materials Genomics" approach is the development and use of Materials Quantitative Structure-Property Relationship (MQSPR) models trained on atomic-level features of nanofiller and polymer constituents and used to predict the polar and dispersive components of their surface energies. Surface energy differences are then correlated with the nanofiller dispersion morphology and filler/matrix interface properties and integrated into a numerical analysis approach that allows the prediction of thermomechanical properties of the spherical nanofilled polymer composites. Systematic experimental studies of silica nanoparticles modified with three different surface chemistries in polystyrene (PS), poly(methyl methacrylate) (PMMA), poly(ethyl methacrylate) (PEMA) and poly(2-vinyl pyridine) (P2VP) are used to validate the models. While demonstrated here as effective for the prediction of meso-scale morphologies and macro-scale properties under quasi-equilibrium processing conditions, the protocol has far ranging implications for Virtual Design.

13.
Bioinformatics ; 29(4): 497-8, 2013 Feb 15.
Article in English | MEDLINE | ID: mdl-23242264

ABSTRACT

SUMMARY: Regioselectivity-WebPredictor (RS-WebPredictor) is a server that predicts isozyme-specific cytochrome P450 (CYP)-mediated sites of metabolism (SOMs) on drug-like molecules. Predictions may be made for the promiscuous 2C9, 2D6 and 3A4 CYP isozymes, as well as CYPs 1A2, 2A6, 2B6, 2C8, 2C19 and 2E1. RS-WebPredictor is the first freely accessible server that predicts the regioselectivity of the last six isozymes. Server execution time is fast, taking on average 2 s to encode a submitted molecule and 1 s to apply a given model, allowing for high-throughput use in lead optimization projects. AVAILABILITY: RS-WebPredictor is accessible for free use at http://reccr.chem.rpi.edu/Software/RS-WebPredictor/


Subjects
Cytochrome P-450 Enzyme System/metabolism; Software; Algorithms; Cinnarizine/chemistry; Cinnarizine/metabolism; Isoenzymes/metabolism
14.
Chem Senses ; 37(8): 723-36, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22824250

ABSTRACT

A list of 147 tetralin- and indan-like compounds was compiled from the literature for investigating the relationship between molecular structure and musk odor. Each compound in the data set was represented by 374 CODESSA and 970 TAE descriptors. A genetic algorithm (GA) for pattern recognition analysis was used to identify a subset of molecular descriptors that could differentiate musks from nonmusks in a plot of the two largest principal components (PCs) of the data. A PC map of the 110 compounds in the training set using 45 molecular descriptors identified by the pattern recognition GA revealed an asymmetric data structure. Tetralin and indan musks were found to occupy a small, but well-defined region of the PC (descriptor) space, with the nonmusks randomly distributed in the PC plot. A three-layer feed-forward neural network trained by back propagation was used to develop a discriminant that correctly classified all the compounds in the training set as musk or nonmusk. The neural network was successfully validated using an external prediction of 37 compounds.


Subjects
Indans/chemistry; Odorants/analysis; Tetrahydronaphthalenes/chemistry; Algorithms; Databases, Factual; Molecular Structure
15.
J Chem Inf Model ; 52(6): 1637-59, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22524152

ABSTRACT

RS-Predictor is a tool for creating pathway-independent, isozyme-specific, site of metabolism (SOM) prediction models using any set of known cytochrome P450 (CYP) substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study, we report on the versatility of the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isozymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1, and 3A4, the largest publicly accessible collection of P450 ligands and metabolites released to date. A comprehensive investigation into the importance of different descriptor classes for identifying the regioselectivity mediated by each isozyme is made through the generation of multiple independent RS-Predictor models for each set of isozyme substrates. Two of these models include a density functional theory (DFT) reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the commercial regioselectivity prediction methods distributed by Optibrium and Schrödinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2 (83.0%), 2A6 (85.7%), 2B6 (82.1%), 2C19 (86.2%), 2C8 (83.8%), 2C9 (84.5%), 2D6 (85.9%), 2E1 (82.8%), 3A4 (82.3%), and merged (86.0%). Comprehensive datamining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into molecular features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs.


Subjects
Cytochrome P-450 Enzyme System/metabolism; Isoenzymes/metabolism; Substrate Specificity
16.
IEEE Trans Pattern Anal Mach Intell ; 34(6): 1068-79, 2012 Jun.
Article in English | MEDLINE | ID: mdl-21987558

ABSTRACT

We present a bundle algorithm for multiple-instance classification and ranking. These frameworks yield improved models on many problems possessing special structure. Multiple-instance loss functions are typically nonsmooth and nonconvex, and current algorithms convert these to smooth nonconvex optimization problems that are solved iteratively. Inspired by the latest linear-time subgradient-based methods for support vector machines, we optimize the objective directly using a nonconvex bundle method. Computational results show this method is linearly scalable, while not sacrificing generalization accuracy, permitting modeling on new and larger data sets in computational chemistry and other applications. This new implementation facilitates modeling with kernels.


Subjects
Algorithms; Pattern Recognition, Automated/methods; Artificial Intelligence; Humans; Neural Networks, Computer; Support Vector Machine
17.
J Chem Inf Model ; 51(11): 2808-20, 2011 Nov 28.
Article in English | MEDLINE | ID: mdl-21999408

ABSTRACT

Least-squares fitting of the Hill equation to quantitative high-throughput screening (qHTS) assays results in frequent unsatisfactory fits. We learn and exploit prior knowledge to improve the Hill fitting in a nonlinear regression method called domain knowledge fitter (DK-fitter). This paper formulates and solves DK-fitter for 44 public qHTS data sets. This new Hill parameter estimation technique is validated using three unbiased approaches, including a novel method that involves generating simulated samples. This paper fosters the extraction of higher quality information from screens for improved potency evaluation.
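For readers unfamiliar with the baseline this abstract improves on, a plain least-squares Hill fit can be sketched as follows. This is not DK-fitter and uses no prior knowledge. It is a minimal Python illustration that assumes a four-parameter Hill form, fixes the bottom and top asymptotes, and substitutes a coarse grid search for a real nonlinear optimizer. The function names, grids, and synthetic data are all hypothetical.

```python
def hill(conc, bottom, top, ec50, slope):
    """Four-parameter Hill equation: response at concentration conc (> 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


def sse(params, concs, responses):
    """Sum of squared residuals between the Hill curve and the observations."""
    return sum((hill(c, *params) - r) ** 2 for c, r in zip(concs, responses))


def grid_fit(concs, responses, bottom, top, ec50_grid, slope_grid):
    """Pick the (EC50, slope) grid point with the smallest SSE.

    bottom and top are held fixed, which assumes a normalized assay.
    """
    candidates = ((bottom, top, e, s) for e in ec50_grid for s in slope_grid)
    return min(candidates, key=lambda p: sse(p, concs, responses))


# Noise-free synthetic titration generated from known parameters.
concs = [0.01, 0.1, 1.0, 10.0, 100.0]
true_params = (0.0, 100.0, 1.0, 1.0)
responses = [hill(c, *true_params) for c in concs]

fit = grid_fit(concs, responses, bottom=0.0, top=100.0,
               ec50_grid=[0.1, 0.3, 1.0, 3.0, 10.0],
               slope_grid=[0.5, 1.0, 2.0])
print(fit)  # (0.0, 100.0, 1.0, 1.0)
```

On real qHTS data, noise and incomplete titration curves leave the asymptotes and slope poorly determined, which is the kind of unsatisfactory fit the abstract describes.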


Subjects
Computational Biology/methods; High-Throughput Screening Assays; Models, Chemical; Computational Biology/statistics & numerical data; Drug Design; Enzyme Inhibitors/pharmacology; Pyruvate Kinase/antagonists & inhibitors; Pyruvate Kinase/metabolism; Quantitative Structure-Activity Relationship; Regression Analysis
18.
J Chem Inf Model ; 51(9): 2302-19, 2011 Sep 26.
Article in English | MEDLINE | ID: mdl-21875058

ABSTRACT

The use of Quantitative Structure-Activity Relationship models to address problems in drug discovery has a mixed history, generally resulting from the misapplication of QSAR models that were either poorly constructed or used outside of their domains of applicability. This situation has motivated the development of a variety of model performance metrics (r(2), PRESS r(2), F-tests, etc.) designed to increase user confidence in the validity of QSAR predictions. In a typical workflow scenario, QSAR models are created and validated on training sets of molecules using metrics such as Leave-One-Out or many-fold cross-validation methods that attempt to assess their internal consistency. However, few current validation methods are designed to directly address the stability of QSAR predictions in response to changes in the information content of the training set. Since the main purpose of QSAR is to quickly and accurately estimate a property of interest for an untested set of molecules, it makes sense to have a means at hand to correctly set user expectations of model performance. In fact, the numerical value of a molecular prediction is often less important to the end user than knowing the rank order of that set of molecules according to their predicted end point values. Consequently, a means for characterizing the stability of predicted rank order is an important component of predictive QSAR. Unfortunately, none of the many validation metrics currently available directly measure the stability of rank order prediction, making the development of an additional metric that can quantify model stability a high priority. To address this need, this work examines the stabilities of QSAR rank order models created from representative data sets, descriptor sets, and modeling methods that were then assessed using Kendall Tau as a rank order metric, upon which the Shannon entropy was evaluated as a means of quantifying rank-order stability. 
Random removal of data from the training set, also known as Data Truncation Analysis (DTA), was used as a means for systematically reducing the information content of each training set while examining both rank order performance and rank order stability in the face of training set data loss. The premise for DTA ROE model evaluation is that the response of a model to incremental loss of training information will be indicative of the quality and sufficiency of its training set, learning method, and descriptor types to cover a particular domain of applicability. This process is termed a "rank order entropy" evaluation or ROE. By analogy with information theory, an unstable rank order model displays a high level of implicit entropy, while a QSAR rank order model which remains nearly unchanged during training set reductions would show low entropy. In this work, the ROE metric was applied to 71 data sets of different sizes and was found to reveal more information about the behavior of the models than traditional metrics alone. Stable, or consistently performing models, did not necessarily predict rank order well. Models that performed well in rank order did not necessarily perform well in traditional metrics. In the end, it was shown that ROE metrics suggested that some QSAR models that are typically used should be discarded. ROE evaluation helps to discern which combinations of data set, descriptor set, and modeling methods lead to usable models in prioritization schemes and provides confidence in the use of a particular model within a specific domain of applicability.
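The two ingredients named in this abstract, Kendall tau as the rank-order metric and Shannon entropy as the stability measure, can be sketched in a few lines of Python. The abstract does not give the exact ROE formula, so the combination shown here (histogramming tau values from repeated truncated-training-set models and taking the entropy of that histogram) is an assumption, as are the function names, the bin count, and the example values.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a between two equal-length score lists (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def shannon_entropy(values, n_bins=10, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of values histogrammed into n_bins over [lo, hi]."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


# Hypothetical tau values from models rebuilt on truncated training sets.
stable_taus = [0.9, 0.9, 0.9, 0.9]    # rank order barely changes
unstable_taus = [-0.9, 0.9]           # rank order flips between runs

print(kendall_tau([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(shannon_entropy(stable_taus), shannon_entropy(unstable_taus))
```

Under this reading, a model whose tau values stay in one histogram bin as data are removed has zero entropy, while a model whose rank order varies widely spreads across bins and scores higher.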


Subjects
Entropy; Models, Molecular; Quantitative Structure-Activity Relationship
19.
J Chem Inf Model ; 51(7): 1667-89, 2011 Jul 25.
Article in English | MEDLINE | ID: mdl-21528931

ABSTRACT

This article describes RegioSelectivity-Predictor (RS-Predictor), a new in silico method for generating predictive models of P450-mediated metabolism for drug-like compounds. Within this method, potential sites of metabolism (SOMs) are represented as "metabolophores", a concept that describes the hierarchical combination of topological and quantum chemical descriptors needed to represent the reactivity of potential metabolic reaction sites. RS-Predictor modeling involves the use of metabolophore descriptors together with multiple-instance ranking (MIRank) to generate an optimized descriptor weight vector that encodes regioselectivity trends across all cases in a training set. The resulting pathway-independent (O-dealkylation vs N-oxidation vs Csp3 hydroxylation, etc.), isozyme-specific regioselectivity model may be used to predict potential metabolic liabilities. In the present work, cross-validated RS-Predictor models were generated for a set of 394 substrates of CYP 3A4 as a proof-of-principle for the method. Rank aggregation was then employed to merge independently generated predictions for each substrate into a single consensus prediction. The resulting consensus RS-Predictor models were shown to reliably identify at least one observed site of metabolism in the top two rank-positions for 78% of the substrates. Comparisons between RS-Predictor and previously described regioselectivity prediction methods reveal new insights into how in silico metabolite prediction methods should be compared.
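The headline figure in this abstract is a top-two hit rate: a substrate counts as correctly predicted if at least one experimentally observed SOM appears among its two highest-ranked sites. A minimal Python sketch of that metric follows. The function name, data layout, and the four example substrates are hypothetical and are not taken from RS-Predictor.

```python
def top_k_hit_rate(predictions, observed, k=2):
    """Fraction of substrates with at least one observed SOM in the top-k ranks.

    predictions: {substrate: [site ids ordered from most to least likely]}
    observed:    {substrate: set of experimentally observed site ids}
    """
    hits = sum(
        1 for name, ranked in predictions.items()
        if set(ranked[:k]) & observed[name]
    )
    return hits / len(predictions)


# Hypothetical ranked site predictions and observed SOMs for four substrates.
predictions = {
    "A": ["C3", "N1", "C7"],
    "B": ["O2", "C5", "C1"],
    "C": ["C4", "C2", "N8"],
    "D": ["C1", "C9", "C6"],
}
observed = {
    "A": {"N1"},
    "B": {"C1"},        # only found at rank 3, so this is a miss for k=2
    "C": {"C4", "N8"},
    "D": {"C9"},
}

print(top_k_hit_rate(predictions, observed))  # 0.75
```

Because only one observed site needs to be recovered, substrates with several observed SOMs are easier to score as hits, which is one reason comparisons between methods depend on how the metric is defined.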


Subjects
Cytochrome P-450 CYP3A; Models, Molecular; Acetaminophen/chemistry; Acetaminophen/metabolism; Binding Sites; Cytochrome P-450 CYP3A/chemistry; Cytochrome P-450 CYP3A/metabolism; Molecular Structure; Stereoisomerism; Warfarin/chemistry; Warfarin/metabolism
20.
Mol Inform ; 30(9): 765-77, 2011 Sep.
Article in English | MEDLINE | ID: mdl-27467409

ABSTRACT

Making suitable modeling choices is crucial for successful in silico drug design, and among the most important of these are the proper extraction and curation of data from qHTS screens and the use of optimized statistical learning methods to obtain valid models. More specifically, we aim to learn the top 1% most potent compounds against a variety of targets in a procedure we call virtual screening hit identification (VISHID). To do so, we exploit quantitative high-throughput screens (qHTS) obtained from PubChem, descriptors derived from molecular structures, and support vector machines (SVM) for model generation. Our results illustrate how an appreciation of subtle issues underlying qHTS data extraction and the resulting SVM models created using these data can enhance the effectiveness of solutions and, in doing so, accelerate drug discovery.
